A High-Resolution Imaging Method for Multiple-Input Multiple-Output Sonar Based on Deterministic Compressed Sensing

Differences between conventional sonar and Multiple-Input Multiple-Output (MIMO) sonar systems arise in achieving high angular and range resolution. MIMO sonar uses Matched Filtering (MF) with well-correlated transmitted signals to enhance spatial resolution by obtaining virtual arrays. However, imperfect correlation characteristics yield high sidelobe values, which hinder accurate target localization in underwater imagery. To address this, a Compressed Sensing (CS) method is proposed by reconstructing echo signals to suppress correlation noise between orthogonal waveforms. A shifted dictionary matrix and a deterministic Discrete Fourier Transform (DFT) measurement matrix are used to multiply received echo signals to yield compressed measurements. A sparse recovery algorithm is applied to optimize signal reconstruction before joint transmit–receive beamforming forms a 2D sonar image in the angle-range domain. Numerical simulations and lake experimental results confirm the effectiveness of the proposed method, by obtaining a lower sidelobe sonar image under sub-Nyquist sampling rates as compared with other approaches.


Introduction
The concept of Multiple-Input Multiple-Output (MIMO) systems originated in the field of communications. Extended to sonar detection and combined with the idea of diversity gain, the MIMO concept has facilitated the development of various MIMO detection systems, including single-site, dual-site, and hybrid multi-site configurations. The application of MIMO technology has yielded significant advancements in radar and sonar array signal processing, particularly in the domain of small target detection [1][2][3]. The core principles of MIMO technology lie in waveform diversity and spatial diversity [4,5], which are used in dense MIMO sonar and distributed MIMO sonar systems, respectively. Distributed MIMO systems offer spatial diversity for achieving stable target detection and tracking. A notable application of distributed MIMO sonar systems in enhancing target detection capabilities is demonstrated by Vossen et al. in their work on estimating underwater geological features [6]. Dense MIMO sonar systems, on the other hand, transmit orthogonal waveforms simultaneously and use Matched Filtering (MF) at the receiver to process the echo signals. Waveform diversity enhances the virtual aperture of the receiving array and improves the performance of underwater imaging [7][8][9]. The equivalence between a MIMO sonar and its virtual Single-Input Multiple-Output (SIMO) counterpart holds only when the transmitted signals have favorable correlation properties [10]. The integration of MIMO technology into sonar imaging offers potential advantages, such as reducing the number of required array elements and simplifying system hardware.
Sensors 2024, 24, 1296
In the context of two-dimensional MIMO imaging sonar, a virtual array with a larger aperture is formed by placing two transmitters at opposite ends of multiple receivers. This arrangement is intended to improve the spatial resolution of sonar images [11,12]. To ensure coherent summation of echoes within the receiver, it is essential that the independently transmitted orthogonal waveforms occupy identical bandwidth resources. Hao He et al. conducted waveform design for the transmission beamforming of wideband signals in MIMO sonar systems [13]. In addition, Fano- and electromagnetically induced transparency (EIT)-type transmission line shapes have been observed to produce high quality (Q) factors, which are useful for high-sensitivity sensing in the development of sensors [14]. Currently, Fano resonances and EIT have demonstrated good resonance response in metamaterial units and square lattice plasma nanostructures [15,16]. Compared with the CS method, its low computational complexity reduces the cost of practical application.
Disregarding the Doppler frequency shift in the echoes, the MF procedure at the receiver can leverage correlation processing between the transmitted signals and echoes [9]. The calculated autocorrelation functions (ACFs) and cross-correlation functions (CCFs) are added after incorporating delay and attenuation adjustments for the various targets. Achieving a high main lobe to sidelobe ratio in the range dimension becomes challenging because of high-level inter-waveform cross-correlation. For Linear Frequency Modulation (LFM, or chirp) waveforms, the Time-Bandwidth Product (TBP) of the frequency sweep can be adjusted to modify the ratio between CCF and ACF levels, effectively reducing the impact of sidelobes. As the length of the encoded waveforms increases, the level of sidelobes decreases. However, a concomitant increase in the length of the receivers leads to a substantial surge in computational requirements, which makes this approach difficult to apply in sonar imaging [17]. Li Jian et al. presented the design of an optimal receiver filter with significantly reduced range sidelobes [8]. However, the complex iterative computation limits the sidelobe reduction achievable in imaging results. Multiple orthogonal waveforms and separation methods have been proposed in the field of imaging for MIMO-SAR [18], but their practical application in sonar systems is challenging. Although considerable progress has been made in improving the detection performance of MIMO sonar systems [19,20], research on suppressing sidelobe levels to improve sonar imaging remains limited.
Compressive Sensing (CS) offers the capability to subsample sparse signals [21], enabling compression of signals during observation at sampling rates much lower than the Nyquist rate. By collecting samples from the compressed signal, the target range can be directly estimated from the sparse domain, allowing for target detection in the received echo signals with reduced measurement requirements [22]. In this study, an effective CS-based method to mitigate sidelobe interference in MIMO sonar imaging is proposed. In noise-free conditions, the output of CS exhibits no autocorrelation sidelobes. In the presence of noise, it is hypothesized that CS can also mitigate the inter-waveform cross-correlation levels. This enables the detection of two closely spaced targets with higher resolution compared with conventional MF [23,24].
The rest of this paper is organized as follows. Section 2 introduces the method adopted in our study. Section 3 describes the numerical simulation and experimental results, which are analyzed in subsections. Finally, in Section 4, we conclude this article with a summary.

MIMO Sonar Array Layout
Considering the coordinate origin as the center, a MIMO sonar array comprises a transmitting uniform linear array (ULA) with $M_t$ elements and a receiving ULA with $N_r$ elements. The inter-element spacings of the transmitting sensors and receiving sensors are denoted as $d_t$ and $d_r$, respectively, both referenced to half-wavelength intervals. By approximating the array equivalent phase center, the virtual array is constructed to expand the receiving array. The specific combination of the transmitting and receiving arrays gives rise to different MIMO array configurations, each characterized by a distinct virtual array representation. To maximize the effective aperture of the MIMO system, it is imperative to ensure that the virtual elements remain suitably separated [9]. The spacings $d_t$ and $d_r$ in the MIMO array normally satisfy $d_t = N_r d_r$.
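The spacing rule $d_t = N_r d_r$ can be illustrated with a small sketch. It uses the common sum co-array convention, in which each virtual element sits at the sum of one transmitter position and one receiver position (the phase-center approximation halves these values without changing the pattern). The function name, the array origin, and the example values are assumptions chosen for illustration, not the authors' geometry.

```python
import numpy as np

def virtual_array_positions(m_t, n_r, d_r):
    """Virtual element positions under the sum co-array convention (assumed)."""
    d_t = n_r * d_r                      # spacing rule from the text: d_t = N_r * d_r
    tx = np.arange(m_t) * d_t            # transmitter positions
    rx = np.arange(n_r) * d_r            # receiver positions
    # Every transmitter/receiver pair contributes one virtual element.
    return np.sort((tx[:, None] + rx[None, :]).ravel())

# Hypothetical example: 2 transmitters, 8 receivers, d_r = 0.5 wavelengths.
positions = virtual_array_positions(m_t=2, n_r=8, d_r=0.5)
print(len(positions), np.diff(positions))
```

With this spacing the $M_t N_r$ virtual elements do not overlap and remain uniformly spaced at $d_r$, which is the property the constraint is meant to secure.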
The MIMO array is arranged to achieve optimal azimuthal resolution while minimizing the physical size. For a given array configuration and sensor count, the physical size of the MIMO array is $(M_t - 1)N_r d_r$. The effective aperture $D_M$ of the MIMO array is equivalent to the aperture of the virtual SIMO sonar, given by $(M_t N_r - 1)d_r$. The effective aperture $D_S$ of a SIMO array is identical to its physical size. When the physical dimensions of the MIMO and SIMO arrays are equal, the relationship between their effective apertures can be expressed as

$$\frac{D_M}{D_S} = \frac{(M_t N_r - 1)\,d_r}{(M_t - 1)\,N_r d_r} = \frac{M_t N_r - 1}{(M_t - 1)\,N_r}.$$

To ensure a satisfactory level of angular resolution, it is common practice to assign a sufficiently large number of receiving sensors $N_r$ to the MIMO array, in which case the ratio approaches $M_t/(M_t - 1)$. The ratio of effective apertures is therefore directly determined by the number of transmitting elements $M_t$. As $M_t$ increases, the effective aperture of the MIMO array gradually converges towards its own physical size, so a reduction in the efficiency of aperture expansion at the receiving end is inevitable. When the number of transmitting elements $M_t$ is set to 2, the MIMO array achieves its optimal configuration, with an effective aperture gain larger than that of any other MIMO array.
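As a quick numerical check of the aperture argument, the following sketch evaluates the ratio of the MIMO effective aperture $(M_t N_r - 1)d_r$ to the physical size $(M_t - 1)N_r d_r$ for several transmitter counts. The function name and the choice $N_r = 32$ are illustrative assumptions.

```python
def effective_aperture_ratio(m_t, n_r):
    """Ratio D_M / D_S for equal physical size; the spacing d_r cancels."""
    return (m_t * n_r - 1) / ((m_t - 1) * n_r)

# Hypothetical receiver count; the ratio shrinks towards 1 as M_t grows.
for m_t in (2, 3, 4, 8):
    print(m_t, round(effective_aperture_ratio(m_t, n_r=32), 3))
```

The ratio is largest for two transmitters and decays towards 1 as more are added, which matches the statement that $M_t = 2$ gives the most efficient aperture expansion.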
Figure 1 shows an extended MIMO imaging sonar system, as used in engineering applications, with two transmitter subarrays positioned at the ends of the receiving ULA. The transmission of multiple orthogonal waveforms enables the derivation of analytical coordinates with $2N_r$ virtual array elements, as predicted by the virtual array expansion theory. This configuration improves the angular resolution and enhances the imaging performance of the sonar system.

Signal Model
Suppose that two transmitting sensors simultaneously emit narrowband LFM pulses with the same pulse duration (T) and bandwidth (B). The magnitude of the frequency sweep rate, $K = B/T$, is the same for both pulses, but the modulation directions differ. Considering the relatively low velocity between the MIMO sonar and the target, the influence of the Doppler frequency shift can be disregarded. To simplify the analysis, we assume the target to be an ideal scatterer, neglecting any medium absorption and transmission losses associated with free-space propagation [25]. Under the assumption of far-field conditions, we consider a single target located at $(P_0, 0)$, where the echo signals reflected from the target and received by the $N_r$ receiving sensors can be assumed to be fully correlated.
Mathematically, the baseband echo signal received at the nth receiver can be expressed as the sum

$$x_n(t) = \sum_{m=1}^{M_t} \sigma_{P_0}^{m}\, s_m(t - \tau_{tm} - \tau_{rn}) + \eta_0(t),$$

where $x_n(t)$ is the echo at the nth receiver, $s_m(t)$ is the mth transmitted waveform, $\sigma_{P_0}^{m}$ denotes the scattering coefficient of the target for the mth waveform, $\tau_{tm}$ represents the time delay of sound propagation from the mth transmitting sensor to the target scatterer, $\tau_{rn}$ denotes the time delay from the target to the nth receiving sensor, and $\eta_0(t)$ represents the Gaussian white noise at the nth receiving sensor, which is uncorrelated with the two transmitted LFM pulses.
Traditional methods in MIMO imaging sonar use multiple sets of matched filters to separate the superposed echo signals [15]. These filters are used to generate MIMO sonar images by applying digital beamforming to both the transmitter and receiver, which yields multiple beams. The impulse response of each filter corresponds to a specific transmitted waveform $s_m(t)$ and can be expressed as

$$h_m(t) = s_m^{*}(-t),$$

where $[\cdot]^{*}$ denotes conjugation. Specifically, by applying the mth matched filter to the echo signal $x_n(t)$ received at the nth sensing element, the resulting filtered signal can be expressed as

$$y_n^{m}(t) = \sigma_{P_0}^{m} R_{m,m}(t - \tau_{tm} - \tau_{rn}) + \sum_{i \neq m} \sigma_{P_0}^{i} R_{m,i}(t - \tau_{ti} - \tau_{rn}) + \eta_0(t) * h_m(t),$$

where $*$ denotes convolution. The noise component within the output of the filter is typically insignificant. The autocorrelation functions (ACFs) of the mth transmitted waveform are denoted as $R_{m,m}$, while $R_{m,i}$ represents the cross-correlation functions (CCFs) between the mth and ith transmitted waveforms. The CCF term can be disregarded when its magnitude is sufficiently lower than that of the ACF. The quantitative analysis of signal ACFs and CCFs is discussed in [12]. Because the two waveforms share the same frequency band and pulse width, their ACFs exhibit identical characteristics. The ACF and CCF are defined as

$$R_{m,m}(\tau) = \int s_m(t)\, s_m^{*}(t - \tau)\, dt, \qquad R_{m,i}(\tau) = \int s_i(t)\, s_m^{*}(t - \tau)\, dt.$$

The cross-correlation noise of LFM waveforms is directly influenced by the Time-Bandwidth Product (TBP), which cannot be made arbitrarily large in practical sonar systems. LFM waveforms with a large TBP can mitigate the high sidelobe levels observed in the range dimension when using the MF method. The ratio between the maximum absolute values of the CCF and ACF is approximately

$$\rho = \frac{\max |R_{m,i}|}{\max |R_{m,m}|} \approx \frac{1}{\sqrt{2BT}}.$$

From Figure 2, it is evident that a larger TBP effectively reduces the levels of the CCF. For a given bandwidth, the ratio ρ decreases as the pulse width T increases, and vice versa. As a concrete example, consider a pair of LFM pulses with B = 40 kHz and T = 40 ms, for which the CCF level is suppressed to approximately −35 dB. When B = 60 kHz, it is only when T ≥ 90 ms that the ratio drops to −40 dB. Nevertheless, shorter pulses are typically used for target detection in practical applications. Increasing both B and T would result in higher system costs and increased hardware complexity.
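The dependence of the cross-correlation level on the TBP can be checked numerically. The sketch below generates a baseband up-chirp and its conjugate (a down-chirp), correlates them, and compares the peak CCF-to-ACF ratio with the closed-form estimate $1/\sqrt{2BT}$, which reproduces the −35 dB and −40 dB figures quoted in the text. The sampling rate and all function names are assumptions, and the simulated peak can sit a few dB away from the approximation because of edge (Fresnel ripple) effects.

```python
import numpy as np

def lfm_pair(bandwidth, duration, fs):
    """Baseband up-chirp and down-chirp sharing the same B and T."""
    t = np.arange(int(round(duration * fs))) / fs - duration / 2
    sweep_rate = bandwidth / duration            # K = B / T
    up = np.exp(1j * np.pi * sweep_rate * t**2)
    return up, np.conj(up)

def ccf_to_acf_ratio_db(bandwidth, duration, fs):
    """Peak cross-correlation relative to the autocorrelation peak, in dB."""
    up, down = lfm_pair(bandwidth, duration, fs)
    acf_peak = np.abs(np.vdot(up, up))           # zero-lag autocorrelation
    ccf = np.correlate(down, up, mode="full")    # conjugates its second argument
    return 20 * np.log10(np.abs(ccf).max() / acf_peak)

def approx_ratio_db(bandwidth, duration):
    """Closed-form estimate 1 / sqrt(2BT), expressed in dB."""
    return -10 * np.log10(2 * bandwidth * duration)

# B and T follow the example in the text; fs = 100 kHz is an assumed value.
print(approx_ratio_db(40e3, 40e-3), ccf_to_acf_ratio_db(40e3, 40e-3, fs=100e3))
```

Doubling the TBP lowers the estimate by about 3 dB, which is why long, wideband pulses are needed before matched filtering alone reaches the −40 dB region.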
In MIMO sonar systems, obtaining perfectly orthogonal waveforms is not feasible in practical applications. The resulting degradation of image quality poses a challenge to the detection performance of sonar systems. Numerous scholars have used frequency-encoded or phase-encoded waveforms to design orthogonal waveforms. Examples of such waveforms include frequency-hopped LFM waveforms [26], polyphase orthogonal code sequence signals [27], and Gold sequence signals [28].
Figure 3 presents the ACF and CCF results for different transmission waveforms processed with MF and window functions. In Figure 3a, up- and down-chirp waveforms are used, with a bandwidth of 40 kHz and a chirp duration of 20 ms. Figure 3b uses a CDMA waveform with LFM sub-pulses as described in [29], consisting of 100 subcodes and a total pulse duration of 20 ms. In Figure 3c, an orthogonal polyphase-coded waveform is used, featuring a sub-pulse width of 0.04 ms, 512 subcodes, and 8 random phase values. Lastly, Figure 3d displays a Gold sequence signal constructed from a 10-stage m-sequence, with a total code length of 1024 and a time width of 20 ms.
Through comparison, it is evident that the ACF sidelobe values of the encoded signals (solid blue line) are significantly lower than those of the LFM waveforms. The correlation values between the up- and down-chirp waveforms are considerably reduced, while the CCF between the encoded signals (dashed black line) is slightly higher in comparison. Additionally, applying a Chebyshev window to LFM waveforms effectively reduces the sidelobe levels in the ACF (dashed red line). However, windowing does not improve (and may even worsen) the ACF sidelobe values of the encoded signals. This limitation hampers their suitability as transmitted signals in MIMO sonar systems. The −3 dB main lobe widths of these waveforms are all below the millisecond level. While the encoded signals offer improved autocorrelation in practical applications, their cross-correlation is not improved, and they require more complex hardware in sonar systems. Hence, up- and down-LFM waveforms are generally chosen as the preferred transmitted waveforms.

The Procedure of the Proposed Method
CS replaces the separate steps of high-resolution sampling and data compression with a single low-resolution acquisition step [30]. The successful recovery of the original signal using CS requires two key conditions: sparsity and incoherence. A measurement matrix that satisfies the restricted isometry property (RIP) [31,32] must be used. CS exploits the fact that a small set of nonadaptive linear measurements of a compressible signal carries enough information for reconstruction and processing [33]. In the proposed method, a dictionary matrix built from shifted copies of the transmitted signal provides a sparse representation of the echo signal. A computationally fast and efficient deterministic DFT matrix is constructed, and deterministic sub-sampling introduces randomness in a deterministic way, so that the matrix does not need to be stored for reconstruction purposes. Because the sampler is deterministic, the proposed construction may achieve a useful compression ratio with lower computational complexity [30], requiring fewer samples than random matrices. In addition, orthonormalizing the measurement matrix makes it mutually incoherent with any dictionary; thus, recovery is possible with high probability. CS allows the sampling rate to be reduced far below the Nyquist rate [34]. It is a promising approach for detecting ultra-wideband signals at reduced sampling rates, provided that the signals have a sparse representation in some domain. By exploiting only a few samples acquired from the echoes, CS achieves effective target detection and estimation of the target distance from the transmitter.
A flow chart of the proposed method is depicted in Figure 4, in which joint transmit-receive beamforming provides increased angular resolution in the form of a multi-beam image. The dictionary matrix ensures the sparsity of the signal, while the measurement matrix guarantees the efficacy of signal reconstruction. These two matrices collectively form the sensing matrix, a pivotal constituent of the algorithm. By reconstructing the target echo signals, the algorithm effectively suppresses the cross-correlation noise among the received orthogonal waveforms, leading to a substantial reduction in sidelobe interference in the sonar image. Sub-Nyquist sampling helps to reduce computational complexity, making the method feasible for practical applications without significantly impacting storage requirements.
Joint transmit-receive beamforming is performed on the echo signals in order to form beams that encompass the target scattering points within the desired angular range. Supposing the kth beam output is represented by $B_k(t) \in \mathbb{C}^{L \times 1}$, where each beam output has a signal length of L, the beam outputs can be collected column-wise into a single matrix with one column per beam.
A dictionary matrix is used for the sparse representation of the echo signals, capitalizing on the favorable correlation properties of the up and down frequency-modulated signals. In this representation, the echo signals exhibit significant peaks at the positions corresponding to the autocorrelation and near-zero values at other positions. The sparsity level of the signal is mainly determined by the number of targets contained in the echo signals. Provided that enough measurement samples are available, the dictionary matrix determines the number of targets that can be distinguished. The dictionary matrix Ψ is constructed from the transmit signal $S_T(t_l)$ and its shifted versions, where L and $N_T$ represent the number of samples in the echo signal and the transmit signal, respectively. Assuming X and Y denote the echo signal and the sparse signal, the sparse representation process can be formulated as

$$X = \Psi Y.$$

Subsequently, the proposed method is used to acquire observation samples for each beam, with orthogonal normalization ensuring their efficacy. To alleviate computational complexity, the first M rows of the observation matrix $\Phi = \left[e^{-i 2\pi k j / n}\right]$, $k, j = 0, 1, \ldots, n - 1$, are selected. Φ can be expressed as

$$\Phi = \left[\Phi_1^{T}, \Phi_2^{T}, \ldots, \Phi_M^{T}\right]^{T},$$

where $\Phi_m \in \mathbb{C}^{1 \times L}$. The kth beam is multiplied element-wise with the mth row of the measurement matrix to yield the observation samples for the kth direction, and the observation samples for the entire beam output are obtained by applying the same operation to every beam. In matrix form, the observation samples are generated by applying Φ to the beam outputs. According to the analysis and derivation in [35], the dictionary matrix Ψ and the measurement matrix Φ together constitute the sensing matrix, given by the product ΦΨ. To obtain the optimal sparse solution, we seek the solution to the underdetermined linear equations in Equation (17).
Figure 5 shows the procedure of the optimization algorithm. To retrieve the target information, sparse recovery algorithms such as Basis Pursuit Denoising (BPDN) and the Least Absolute Shrinkage and Selection Operator (LASSO), provided by the SparseLab package [24], were used to optimize the sparse signals. They are equivalent methods but were developed by different research communities.
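The measurement chain described above (shifted dictionary, first M rows of a DFT matrix, sparse recovery) can be sketched in a few lines. This is only a minimal illustration under stated assumptions: it uses a single chirp rather than the paper's up/down pair, a noise-free echo, and Orthogonal Matching Pursuit as a simple stand-in for the BPDN and LASSO solvers used in the paper. All function names, sizes, and target positions are hypothetical.

```python
import numpy as np

def shifted_dictionary(s, n_echo):
    """Columns are zero-padded copies of s delayed by 0, 1, 2, ... samples."""
    n_t = len(s)
    psi = np.zeros((n_echo, n_echo - n_t + 1), dtype=complex)
    for p in range(psi.shape[1]):
        psi[p:p + n_t, p] = s
    return psi

def partial_dft(n_rows, n_echo):
    """First n_rows rows of the orthonormal n_echo-point DFT matrix."""
    k = np.arange(n_rows)[:, None]
    j = np.arange(n_echo)[None, :]
    return np.exp(-2j * np.pi * k * j / n_echo) / np.sqrt(n_echo)

def omp(a, z, n_nonzero):
    """Orthogonal Matching Pursuit (swapped in here for BPDN/LASSO)."""
    support, residual = [], z.copy()
    coef = np.zeros(0, dtype=complex)
    for _ in range(n_nonzero):
        corr = np.abs(a.conj().T @ residual)
        if support:
            corr[support] = 0.0                  # never pick an atom twice
        support.append(int(np.argmax(corr)))
        coef, *_ = np.linalg.lstsq(a[:, support], z, rcond=None)
        residual = z - a[:, support] @ coef
    y = np.zeros(a.shape[1], dtype=complex)
    y[support] = coef
    return y

# Assumed sizes: L = 256 echo samples, N_T = 64 pulse samples, M = 64 rows (25 %).
n_echo, n_t, n_rows = 256, 64, 64
n = np.arange(n_t)
s = np.exp(1j * np.pi * (0.25 / n_t) * n**2)    # chirp sweeping 0 to 0.25 cycles/sample

psi = shifted_dictionary(s, n_echo)              # dictionary matrix
phi = partial_dft(n_rows, n_echo)                # deterministic measurement matrix
y_true = np.zeros(psi.shape[1], dtype=complex)
y_true[40], y_true[150] = 1.0, 0.7               # two hypothetical targets
z = phi @ (psi @ y_true)                         # compressed measurements of the echo
y_hat = omp(phi @ psi, z, n_nonzero=2)           # recover the sparse range profile
print(np.flatnonzero(np.abs(y_hat) > 1e-6))
```

Because the recovered vector is sparse by construction, it contains no correlation sidelobes between the detected delays, which is the behaviour the proposed method exploits. In this sketch the chirp band is chosen to coincide with the retained low-frequency DFT rows; a different band would require selecting the corresponding rows.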

Equations (18) and (19) denote the objective functions of BPDN and LASSO, respectively, where $\varepsilon = m\sigma^2$ and $\sigma^2$ denotes the noise variance in Equation (18). Equation (19) was solved by the Stepwise regression False Discovery Rate (SWr-FDR) algorithm, which admits variables according to the absolute size of their t-statistics up to a preset significance threshold based on the False Discovery Rate (FDR).
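For orientation, the textbook forms of the two recovery problems are sketched below. They are not copied from the paper's Equations (18) and (19): the symbol $Z$ for the compressed measurements and the regularization weight $\lambda$ are assumptions, while $\Phi$, $\Psi$, $Y$, and $\varepsilon = m\sigma^2$ follow the text.

```latex
% Basis Pursuit Denoising (constrained form, assumed notation)
\min_{Y} \; \lVert Y \rVert_{1}
\quad \text{subject to} \quad
\lVert Z - \Phi \Psi Y \rVert_{2}^{2} \le \varepsilon

% LASSO (penalized form, assumed notation)
\min_{Y} \; \tfrac{1}{2} \lVert Z - \Phi \Psi Y \rVert_{2}^{2} + \lambda \lVert Y \rVert_{1}
```

For a suitable pairing of $\varepsilon$ and $\lambda$, the two problems share the same solution, which is the sense in which the methods are described as equivalent.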

Results and Discussion
This section mainly describes the performance of the proposed method compared with MF [12], MF with weighting, and the Richardson-Lucy (R-L) algorithm, also known as the Deconvolution (Dcv) algorithm [36], which are evaluated by analyzing simulation results and experimental data.

Numerical Simulation
To illustrate the effectiveness of the proposed method, we compare the peak sidelobe level (PSL) and the −3 dB main lobe width of the different methods. In addition, we use the normalized root mean square error (RMSE) as the metric to analyze how the number of observed samples influences the signal reconstruction process of the algorithm. With the target's scattering coefficient set to 1, up- and down-chirp waveforms were emitted simultaneously from two distinct transmitting elements. The numerical simulation parameters are shown in Table 1. PSL is used as an evaluation index to assess how well cross-correlation noise is suppressed in the superposed output signals. It is defined by Equation (20), in which $\tilde{z}(t)$ represents the output signals, SL represents the level value of the sidelobe area, and max(•) represents the maximum value.
The evaluation metric used to compare the relative reconstruction errors of various measurement matrices under noisy conditions was the RMSE, calculated according to Equation (21), in which $z(t)$ represents the real signal and $\tilde{z}(t)$ represents the approximate sparse signal obtained by the recovery algorithm.
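The two metrics can be sketched as follows. The exact expressions of Equations (20) and (21) are not reproduced here, so both definitions are assumptions: the PSL is taken as the largest magnitude outside the main lobe relative to the peak, with the main lobe bounded by the first local minimum on each side, and the normalized RMSE is taken as the error norm divided by the norm of the true signal. Function names are hypothetical.

```python
import numpy as np

def peak_sidelobe_level_db(output):
    """Largest sidelobe magnitude relative to the main lobe peak, in dB."""
    mag = np.abs(np.asarray(output))
    peak = int(np.argmax(mag))
    left = peak
    while left > 0 and mag[left - 1] < mag[left]:               # walk down to the left null
        left -= 1
    right = peak
    while right < len(mag) - 1 and mag[right + 1] < mag[right]:  # and to the right null
        right += 1
    sidelobes = np.concatenate((mag[:left], mag[right + 1:]))
    if sidelobes.size == 0:
        return -np.inf                                           # no sidelobe region at all
    return 20.0 * np.log10(sidelobes.max() / mag[peak])

def normalized_rmse(z_true, z_est):
    """Reconstruction error norm relative to the norm of the true signal."""
    z_true = np.asarray(z_true)
    z_est = np.asarray(z_est)
    return np.linalg.norm(z_est - z_true) / np.linalg.norm(z_true)

# Small synthetic range profile: peak of 1.0 with sidelobes of 0.1.
profile = [0.1, 0.05, 0.5, 1.0, 0.5, 0.05, 0.1, 0.02]
print(peak_sidelobe_level_db(profile))
```

Bounding the main lobe by its nulls keeps the metric independent of the main lobe width, so the PSL and the −3 dB width can be reported separately, as in Table 2.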

PSL and Main Lobe Width of the Output Results
Figure 6 presents the outputs obtained by applying MF with and without the window function, as well as the outputs of the Dcv algorithm and the proposed method, for overlapping up- and down-LFM waveforms. As shown in Table 2, a quantitative analysis was conducted using the PSL and the −3 dB main lobe width as performance metrics. A −40 dB Chebyshev window effectively reduced the ACF sidelobe levels, but the CCF values between the two orthogonal waveforms worsened. The deconvolution method demonstrates better sidelobe suppression at −36.82 dB, although engineering practice typically requires levels below −40 dB to obtain a good image. The CS method not only achieves a much lower PSL but also exhibits a narrower −3 dB main lobe width than the alternative approaches.
The investigation also compares the PSL results along the range dimension for different values of the TBP and for different background noise levels. The analysis was based on averaging the results of 100 Monte Carlo simulations. Figure 7a compares the peak sidelobe levels achieved by the MF, MF with weighting, Dcv, and CS methods for different TBPs. The PSL of the proposed method is around −45 dB, which surpasses the performance of the other methods while exhibiting minimal dependence on the TBP, with effective reconstruction of target echoes from overlapped multiple waveforms. Figure 7b shows that the proposed method outperforms the other methods in terms of the output signal's PSL for SNRs ranging from −10 to 30 dB. Even in a low-SNR environment, CS is able to effectively suppress cross-correlation noise between signals, resulting in PSLs below −27 dB. It is worth noting that the output performance at low SNR is strongly influenced by the noise variance. In CS, the selection of the first m rows of the measurement matrix determines the number of observed samples, which affects the PSL when the SNR is held constant.
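The role of the first m rows of the DFT measurement matrix can be illustrated with a minimal, noise-free sketch. Everything specific here is an assumption made for illustration: the toy chirp, the sizes n, L, and m, the delay bins of the two targets, and in particular the use of orthogonal matching pursuit (OMP), which stands in for the paper's own optimization algorithm and does not reproduce it. Only a single waveform is placed in the dictionary, so the overlapped up- and down-chirp case is not covered.

```python
import numpy as np

def shifted_dictionary(s, n):
    """Columns are delayed, zero-padded copies of the waveform s (n samples each)."""
    L = len(s)
    D = np.zeros((n, n - L + 1), dtype=complex)
    for d in range(n - L + 1):
        D[d:d + L, d] = s
    return D

def partial_dft(n, m):
    """First m rows of the unitary n-point DFT matrix (deterministic measurement)."""
    k = np.arange(m)[:, None]
    t = np.arange(n)[None, :]
    return np.exp(-2j * np.pi * k * t / n) / np.sqrt(n)

def omp(A, y, n_nonzero):
    """Plain orthogonal matching pursuit (assumed stand-in solver)."""
    norms = np.linalg.norm(A, axis=0)
    residual = y.copy()
    support = []
    coef = np.zeros(0, dtype=complex)
    for _ in range(n_nonzero):
        corr = np.abs(A.conj().T @ residual) / norms
        if support:
            corr[support] = 0.0          # never pick the same atom twice
        support.append(int(np.argmax(corr)))
        # Least-squares fit on the current support, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1], dtype=complex)
    x[support] = coef
    return x

n, L, m = 256, 64, 64                         # m/n = 25% of the samples (illustrative)
t = np.arange(L)
s = np.exp(1j * np.pi * (0.5 / L) * t ** 2)   # toy chirp, 0 to 0.5 cycles/sample
D = shifted_dictionary(s, n)
Phi = partial_dft(n, m)

a_true = np.zeros(D.shape[1], dtype=complex)
a_true[40], a_true[140] = 1.0, 0.6            # two point targets (assumed delay bins)
y = Phi @ (D @ a_true)                        # compressed measurements, no noise
a_hat = omp(Phi @ D, y, n_nonzero=2)
print(np.flatnonzero(np.abs(a_hat) > 1e-6))
```

Increasing m enlarges the retained frequency band and therefore sharpens the correlation between neighbouring delay atoms, which is one way to read the dependence of the PSL on the number of observed samples at a fixed SNR.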

Imaging Results
Figure 9 illustrates the images and range profiles for single-target scenarios obtained with the MF and CS methods. MF exhibits relatively high sidelobe levels in the range dimension. In contrast, CS iteratively optimizes the sparse signal under sub-Nyquist sampling conditions, leading to significantly lower sidelobe levels and superior performance. Even in scenarios where targets in close proximity cause interference, the proposed method accurately determines the location of the target.

Figure 10 presents the two-dimensional sonar images of multiple targets. When MF is used to process overlapping echoes, the sidelobe levels severely degrade target detection performance. Figure 10a shows the sonar image obtained by transmitting a single waveform, while Figure 10b shows the image obtained by transmitting up- and down-chirp signals. Although the window function reduces the autocorrelation sidelobe value, it degrades the cross-correlation level, as shown in Figure 10c. Figure 10d shows that sidelobe levels of about −30 dB can be obtained in the distance dimension by the Dcv method, and the width of the main lobe narrows. Figure 10e shows the image obtained by using the CS method. The cross-correlation noise between different waveforms is well suppressed using only about 12% of the samples, which helps reduce the sidelobe levels in the distance dimension of the multi-beam image. In Figure 10f, the position of the target at 15 m is more clearly visible than with the other methods. The range sidelobe level can reach −40 dB or lower, with no interference from other sidelobes, which benefits the detection probability of the target. In addition, the use of a virtual aperture increases the azimuthal resolution of the sonar image compared with the single-transmitter scenario depicted in Figure 10a. By adopting large-aperture arrays, better-quality sonar images can be acquired.

The Effect of the Measurement Samples
The quantity of observed samples directly influences the signal reconstruction process of the algorithm. Therefore, a comprehensive analysis was conducted using 100 Monte Carlo experiments to examine the impact of different SNRs on factors such as the PSL, reconstruction error, and algorithm execution time. In Figure 11, the average PSL exhibits a consistent decreasing trend with increasing SNR. A larger number of measurement samples for the utilized DFT matrix facilitates improved signal reconstruction. At low SNR, a higher quantity of observed samples is required to realize accurate signal recovery without disturbance from correlated noise.

In Figure 12a, the error decreases as the number of observed samples increases. After orthonormalization, deterministic subsampling eliminates partial stochasticity, leading to substantial reductions in reconstruction error compared with the other matrices; the deterministic DFT matrix is able to recover signals accurately with a smaller number of samples. Figure 12b gives the runtime of signal recovery using different measurement matrices for varying numbers of observed samples. As expected, the time required for signal recovery increases with the number of observed samples. Among the fully deterministic matrices, the DFT matrix outperforms the DCT matrix in terms of recovery time. The differences are typically on the order of seconds. The DFT matrix is thus relatively efficient in both signal recovery accuracy and runtime.
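To make the comparison of measurement matrices more tangible, the following sketch builds three m × n candidates with orthonormal rows and defines an RMSE metric. It assumes that "first m rows" is the subsampling rule for the deterministic matrices and that the random matrix is a Gaussian one orthonormalized by QR; the helper names are hypothetical, and the sketch does not reproduce the curves in Figure 12.

```python
import numpy as np

def partial_dft(n, m):
    """First m rows of the unitary n-point DFT matrix."""
    k = np.arange(m)[:, None]
    t = np.arange(n)[None, :]
    return np.exp(-2j * np.pi * k * t / n) / np.sqrt(n)

def partial_dct(n, m):
    """First m rows of the orthonormal DCT-II matrix."""
    k = np.arange(m)[:, None]
    t = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    C[0, :] /= np.sqrt(2.0)              # DC row needs the extra 1/sqrt(2) scaling
    return C

def gaussian_rows(n, m, seed=0):
    """Random Gaussian matrix with rows orthonormalized via QR (assumed baseline)."""
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((m, n))
    Q, _ = np.linalg.qr(G.T)             # Q is n x m with orthonormal columns
    return Q.T

def rmse(x_hat, x):
    """Root-mean-square error between a recovered and a reference signal."""
    return float(np.sqrt(np.mean(np.abs(np.asarray(x_hat) - np.asarray(x)) ** 2)))

n, m = 64, 8
for name, Phi in [("DFT", partial_dft(n, m)),
                  ("DCT", partial_dct(n, m)),
                  ("Gaussian", gaussian_rows(n, m))]:
    gram_error = np.abs(Phi @ Phi.conj().T - np.eye(m)).max()
    print(f"{name:8s} shape={Phi.shape} max |Phi Phi^H - I| = {gram_error:.1e}")
```

Sweeping m and feeding each matrix to the same recovery routine, then averaging `rmse` and the elapsed time over repeated noisy trials, would give curves of the kind reported in Figure 12a,b.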

Experimental Data Processing
To assess the effectiveness of the proposed method in practical application, a field experiment was conducted in a lake with an approximate depth of 50 m, as shown in Figure 13. The array used in the experiment consisted of two transmitters and 192 receiver hydrophones arranged as ULAs, with the transmitters parallel to the receivers and positioned at the ends of the receiver ULA. The sonar wet end was fixed vertically underwater at a depth of approximately 3 m using a long pole. Two steel spherical targets were suspended underwater at a depth of 7.5 m using cables, and the bottom target was a hollow steel cylinder. The measured distance from the sonar array to the targets was approximately 16.5 m, subject to measurement error. LFM pulses were applied to the transmitters on the two sides, and the 2D imaging results produced by the different methods were compared for the single-target and multi-target cases.
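The benefit of placing two transmitters at the ends of a 192-element receiver ULA can be sketched through the far-field virtual array, whose element positions are the sums of transmitter and receiver positions. The exact geometry is an assumption here: the spacing is normalized to d = 1, and the transmitters are placed at 0 and 192d so that the virtual array comes out uniform. The actual element spacing and transmitter offsets of the experimental array are not stated in this section.

```python
import numpy as np

def virtual_array(tx_pos, rx_pos):
    """Far-field MIMO virtual element positions: every tx position plus every rx position."""
    tx = np.asarray(tx_pos, dtype=float)
    rx = np.asarray(rx_pos, dtype=float)
    return np.sort((tx[:, None] + rx[None, :]).ravel())

n_rx = 192
d = 1.0                                  # normalized element spacing (assumed)
rx = d * np.arange(n_rx)                 # receiver ULA
tx = np.array([0.0, n_rx * d])           # two transmitters near the array ends (assumed offsets)

v = virtual_array(tx, rx)
print(len(v), v[0], v[-1])               # number of virtual elements and aperture extent
```

Under these assumptions the virtual aperture is twice the physical receive aperture, which is the source of the improved azimuthal resolution relative to a single-transmitter configuration.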

Imaging Results of Different Methods
The outcome of the lake experiment is depicted in Figure 14. Clutter is observed in certain regions, predominantly arising from energy leakage in the sidelobes of the scattering regions. The two targets are situated at angles of approximately 6.8° and 9.2°, respectively, while the scatter located at −2° is attributed to underwater reverberation. With MF and weighted MF, as shown in Figure 14a,b, the resolution of the targets in the distance dimension is limited by the higher sidelobe levels that result from imperfect orthogonality among the received signals. Consequently, the sonar image does not exhibit perfect fidelity. Figure 14c displays the result of applying the Dcv method to the echo signals in the range dimension. Nevertheless, some interference in the form of false peaks persists and adversely affects the estimation of target positions. In Figure 14d, the proposed method treats all sidelobes as false targets during iterative optimization, and the image resolution is greatly improved by iterative optimization of the multi-beam data with a limited number of observed samples.
The 2D acoustic images of the bottom target in the lake experiment are displayed in Figure 15, where the target points are marked with solid white circles. Floats placed on the lake surface mark the location of the bottom target, and some uneven flat rocks can cause bottom disturbance. After weighting (Figure 15b), the cross-correlation sidelobes and the aliased sidelobes near the main lobe are not effectively suppressed. As the comparison of Figure 15c,d shows, the proposed method reduces the sidelobe interference more effectively than Dcv, with a larger dynamic range of −40 dB, and simultaneously achieves high resolution in the distance dimension of the multi-beam image. Clutter in some local regions of the target scene can also be eliminated as noise, improving the judgment of the target location within the area of interest.

Performance Analysis
Figure 16a,b presents the projection results in the distance dimension for the targets. CS shows a distinct advantage by providing a clear view of the peak positions of the targets, free of sidelobe interference. In contrast, the traditional MF approach struggles to differentiate between closely spaced targets at similar distances. The persistence of sidelobes poses a challenge, as weaker targets may be obscured by them. More importantly, the number of iterations in the deconvolution process must be selected carefully to avoid amplifying sidelobe levels due to approximation mismatch. Typically, 10 or 20 iterations are recommended when implementing the R-L algorithm. In contrast, CS optimizes the target signal iteratively while taking the noise level into account, and increasing the number of iterations does not degrade the sidelobe levels.

Furthermore, the PSL and −3 dB main lobe width in the distance dimension were compared among the different methods. Table 3 lists the PSL and −3 dB main lobe width for the two kinds of targets. Compared with the outputs produced by MF and Dcv, the proposed method yields much lower sidelobe levels, enabling clearer identification of the targets. Additionally, the −3 dB main lobe width of the proposed method is approximately 0.02–0.03 m, which is narrower than that of the other methods.

The current study assumes a far-field model rather than practical scenarios involving near-field environments. In such cases, conventional beamforming encounters challenges due to sidelobe leakage throughout the scanning angle, including the presence of clutter in the range dimension. The proposed method treats all these sidelobes as false targets during iterative optimization and consequently offers a lower PSL, benefiting both target localization and sidelobe suppression. The beam sidelobe levels of the two kinds of targets have a dynamic range of over −40 dB, but the interference around the two closer spherical targets is larger because of reverberation or clutter. Future investigations will also consider additional factors such as array errors and directional uncertainties in the transducers to further validate the algorithm's stability and robustness.
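Since the discussion above turns on the iteration count of the R-L algorithm, a generic one-dimensional Richardson-Lucy iteration is sketched below. The Gaussian point-spread function, the flat initial estimate, and the single-spike test profile are assumptions chosen for illustration; how the Dcv method in this work constructs its point-spread function from the transmitted waveforms is not shown in this section and is not reproduced here.

```python
import numpy as np

def richardson_lucy_1d(y, psf, n_iter=20, eps=1e-12):
    """Generic 1-D Richardson-Lucy deconvolution of a nonnegative profile y."""
    y = np.asarray(y, dtype=float)
    psf = np.asarray(psf, dtype=float)
    psf = psf / psf.sum()                 # unit-area point-spread function
    psf_flipped = psf[::-1]               # correlation = convolution with flipped PSF
    x = np.full_like(y, y.mean())         # flat, nonnegative initial estimate
    for _ in range(n_iter):
        blurred = np.convolve(x, psf, mode="same")
        ratio = y / (blurred + eps)       # eps guards against division by zero
        x = x * np.convolve(ratio, psf_flipped, mode="same")
    return x

# Toy profile: one point target blurred by an assumed Gaussian PSF.
psf = np.exp(-0.5 * (np.arange(-12, 13) / 3.0) ** 2)
x_true = np.zeros(101)
x_true[50] = 1.0
y = np.convolve(x_true, psf / psf.sum(), mode="same")

x_hat = richardson_lucy_1d(y, psf, n_iter=20)
print(int(np.argmax(x_hat)), x_hat[50] / y[50])   # peak position and sharpening ratio
```

In this noise-free toy case more iterations keep sharpening the peak. With noisy data or a mismatched point-spread function, continued iteration tends to amplify artefacts, which is consistent with the recommendation to stop after 10 or 20 iterations.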

Figure 1. Equivalent procedure of the MIMO virtual array.

Figure 2. The ratio ρ at different pulse widths T and bandwidths B.
[…] displays a gold sequence signal constructed from a 10-stage m-sequence, with a total code length of 1024 and a time width of 20 ms. Through comparison, it is evident that the sidelobe values of the ACF for the encoded signals (solid blue line) are significantly lower than those of the LFM waveforms. The correlation values between the up- and down-chirp waveforms exhibit considerable […], while the CCF between the encoded signals (dashed black line) is slightly […] in comparison. Additionally, the utilization of a Chebyshev window with LFM waveforms effectively reduces the sidelobe levels in the ACF (dashed red line). However, it does not improve (and may even exacerbate) the sidelobe values of the ACF for the encoded signals. This limitation hampers their suitability as transmitted signals in MIMO sonar. The −3 dB main lobe widths of these waveforms are all below the millisecond level. Although the encoded signals offer improved autocorrelation in practical applications, their cross-correlation remains unimproved, and their hardware system complexity is high.

Figure 3. Comparison of the ACF and CCF in various transmitted signals: (a) chirp waveforms, (b) frequency-hopped chirp waveforms, (c) polyphase orthogonal code sequence waveforms, and (d) gold sequence waveforms.

Figure 4. The processing procedure of the proposed method.

Figure 5. The procedure of the optimization algorithm.

Figure 6. The output results with the different methods under SNR = 30 dB.

Figure 7. (a) PSLs of the output signals of the transmitted signals at different TBPs. (b) PSLs of the output signals at different environmental SNRs.

Figure 8 illustrates the relationship between the −3 dB main lobe width of the output signals in the range dimension and the SNR, for SNR levels ranging from −10 to 30 dB. As the SNR increases, there is a slight reduction in the main lobe width. The adoption of wideband transmitted signals ensures a high precision level within 0.001 m. For the MF technique, the main lobe width typically ranges from approximately 0.015 m to 0.016 m. Although window functions effectively mitigate sidelobe levels, they widen the main lobe, which reaches approximately 0.023 m. Both the CS and Dcv methods are iterative optimization techniques, and their output signals allow target positions to be determined more accurately. Moreover, the main lobe widths achieved by the CS method are approximately one-tenth of those obtained with MF.

Figure 8. The −3 dB main lobe widths of the output signals at different SNRs.

Figure 9. Single target of a 2D image: (a) MF and (b) CS.

Figure 10. Different methods for multiple targets of a 2D sonar image and the range dimension curve: (a) SIMO with MF, (b) MIMO with MF, (c) MIMO with weighted MF, (d) MIMO with Dcv, (e) MIMO with CS, and (f) a comparison of the results in range dimension.

Figure 11. The PSL of the outputs varies with the number of observed samples under different SNRs.

Figure 12. (a) RMSE of different observation matrices under different observation sample quantities. (b) The recovery time of different observation matrices under different observation sample quantities.

Figure 13. Diagram of the lake experiment scene.

Figure 14. Lake experiment results of ball targets: (a) matched filtering, (b) matched filtering with weighting, (c) the deconvolution method, and (d) the proposed method.

Figure 15. Lake experiment results of a sinking target: (a) matched filtering, (b) matched filtering with weighting, (c) the deconvolution method, and (d) the proposed method.

Figure 16. Comparison of the results of different methods in range and angle dimensions: (a) ball targets and (b) a sinking target.

Table 2. PSL and −3 dB main lobe width of the output results.

Table 3. PSL and −3 dB main lobe width of the two kinds of targets.